Experiments with PageRank Computation

Authors

  • Ashraf Khalil
  • Yong Liu
Abstract

The PageRank algorithm is one of the most commonly used algorithms for determining the global importance of web pages. Because the web graph contains billions of nodes, computing a PageRank vector is very computationally intensive and may take anywhere from hours to months, depending on the efficiency of the algorithm. This has prompted many researchers to propose techniques for improving the PageRank algorithm, covering aspects such as stability, convergence speed, memory consumption, I/O efficiency, and the properties of the connectivity matrix [2-7, 11]. However, some aspects of the PageRank algorithm remain unstudied, and very few techniques build on the results of others. In this work we investigate some PageRank properties and report our findings.
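The abstract leaves the iteration itself implicit. As a rough illustration of where the cost comes from, the following minimal power-iteration sketch (a generic Python illustration, not the authors' implementation; the adjacency-list input and parameter values are assumptions) makes a full pass over every edge of the graph in each iteration, which is why convergence speed and I/O efficiency dominate at the scale of billions of nodes.

    # Minimal power-iteration PageRank sketch (illustrative only, not the
    # authors' implementation). Assumes the whole graph fits in memory as
    # an adjacency list {node: [out-neighbours]}.
    def pagerank(graph, damping=0.85, tol=1e-8, max_iter=100):
        nodes = list(graph)
        n = len(nodes)
        rank = {u: 1.0 / n for u in nodes}  # uniform start vector

        for _ in range(max_iter):
            # Rank mass sitting on dangling nodes (no out-links) is
            # redistributed uniformly so the vector keeps summing to 1.
            dangling = sum(rank[u] for u in nodes if not graph[u])
            new_rank = {u: (1.0 - damping) / n + damping * dangling / n
                        for u in nodes}
            # Every node pushes its rank equally along its out-links;
            # this single pass over all edges dominates the per-iteration cost.
            for u in nodes:
                out = graph[u]
                if out:
                    share = damping * rank[u] / len(out)
                    for v in out:
                        new_rank[v] += share
            # The L1 distance between successive vectors decides convergence.
            delta = sum(abs(new_rank[u] - rank[u]) for u in nodes)
            rank = new_rank
            if delta < tol:
                break
        return rank

    # Toy example: a four-page graph with one dangling node ("d").
    toy = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": []}
    print(pagerank(toy))

At web scale the adjacency lists are streamed from disk rather than held in a dictionary, but the per-iteration structure, and therefore the cost profile, is the same.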

Similar Resources

PageRank Computation and the Structure of the Web: Experiments and Algorithms

We describe some computational experiments carried out with variants of the “PageRank” model due to Page et al., with particular reference to the small and large scale structure of the Web.

Traps and Pitfalls of Topic-Biased PageRank

We discuss a number of issues in the definition, computation and comparison of PageRank values that have been addressed sparsely in the literature, often with contradictory approaches. We study the difference between weakly and strongly preferential PageRank, which patch the dangling nodes with different distributions, extending analytical formulae known for the strongly preferential case, and ...
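As a hedged sketch of that distinction, in standard PageRank notation rather than notation taken from the article itself: with damping factor \alpha, preference vector v, row-stochastic link matrix P, and d the 0-1 indicator vector of dangling nodes, PageRank is the solution x of

    x^T = \alpha \, x^T \left( P + d\,u^T \right) + (1 - \alpha)\, v^T

The strongly preferential variant patches dangling nodes with the preference vector itself (u = v), while the weakly preferential variant uses a distribution fixed independently of v, typically the uniform one (u = \mathbf{1}/n).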

Web-Site-Based Partitioning Techniques for Reducing the Preprocessing Overhead before the Parallel PageRank Computations

The efficiency of the PageRank computation is important since the constantly evolving nature of the Web requires this computation to be repeated many times. Due to the enormous size of the Web’s hyperlink structure, PageRank computations are usually carried out on parallel computers. Recently, a hypergraph-partitioning-based formulation for parallel sparse-matrix vector multiplication is propos...

Do Your Worst to Make the Best: Paradoxical Effects in PageRank

Deciding which kind of visit accumulates high-quality pages more quickly is one of the most often debated issues in the design of web crawlers. It is known that breadth-first visits work well, as they tend to discover pages with high PageRank early on in the crawl. Indeed, this visit order is much better than depth-first, which is in turn even worse than a random visit; nevertheless, breadth-fir...

Paradoxical Effects in PageRank Incremental Computations

Deciding which kind of visiting strategy accumulates high-quality pages more quickly is one of the most often debated issues in the design of web crawlers. This paper proposes a related, and previously overlooked, measure of effectivity for crawl strategies: whether the graph obtained after a partial visit is in some sense representative of the underlying web graph as far as the computation of ...

Journal:

Volume   Issue

Pages  -

Publication date: 2004